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Abstract — We present the tree-structure expectation propa- 
gation (Tree-EP) algorithm to decode low-density parity-check 
(LDPC) codes over discrete memoryless channels (DMCs). EP 
generalizes belief propagation (BP) in two ways. First, it can be 
used with any exponential family distribution over the cliques in 
the graph. Second, it can impose additional constraints on the 
marginal distributions. We use this second property to impose 
pair-wise marginal constraints over pairs of variables connected 
to a check node of the LDPC code's Tanner graph. Thanks to 
these additional constraints, the Tree-EP marginal estimates for 
each variable in the graph are more accurate than those provided 
by BP. We also reformulate the Tree-EP algorithm for the binary 
erasure channel (BEC) as a peeling-type algorithm (TEP) and we 
show that the algorithm has the same computational complexity 
as BP and it decodes a higher fraction of errors. We describe 
the TEP decoding process by a set of differential equations that 
represents the expected residual graph evolution as a function 
of the code parameters. The solution of these equations is used 
to predict the TEP decoder performance in both the asymptotic 
regime and the finite-length regime over the BEC. While the 
asymptotic threshold of the TEP decoder is the same as the 
BP decoder for regular and optimized codes, we propose a 
scaling law (SL) for finite-length LDPC codes, which accurately 
approximates the TEP improved performance and facilitates its 
optimization. 



L Introduction 

Low-density parity-check (LDPC) codes are well known 
channel capacity-approaching (c.a.) linear codes. In his 
PhD [1], Gallager proposed LDPC codes along with linear- 
time practical decoding methods, among which the belief 
propagation (BP) algorithm plays a fundamental role. BP was 
later redescribed and popularized in the articial intelligence 
community to perform approximate inference over graphical 
models, see for instance [2], [3], [4]. Given a factor graph 
that represents a joint probability density function (pdf) p(V) 
of a set of discrete random variables [5], BP estimates the 
marginal probability function for each variable. It uses a local 
message-passing algorithm between the nodes of the graph. 
The complexity of this algorithm is linear in the number of 
nodes [2]. For tree-like graphs, the BP solution is exact, but 
for graphs with cycles, BP is strictly suboptimal [6], [7], [8]. 

Linear block codes can be represented using factor (Tanner) 
graphs [9], where the factor nodes enforce the parity check 
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constraints. For LDPC codes, the presence of cycles in the 
Tanner graph quickly decays with the code length n. For 
large block lengths, a channel decoder based on BP achieves 
an excellent performance, close to the bitwise maximum a 
posteriori (bit-MAP) decoding, in certain scenarios [3], [10]. 
Nevertheless, the bit-MAP solution can only be achieved when 
the code length, code density and computational complexity 
go to infinity [11], [12], [13]. 

The analysis of the BP for LDPC decoding over independent 
and identically distributed channels is detailed in [14], [15], 
in which the limiting performance and code optimization 
are addressed. For the binary erasure channel (BEC), the 
BP decoder presents an alternative formulation, in which the 
known variable nodes (encoded bits) are removed from the 
graph after each iteration. The BP, under this interpretation, 
is referred to as the peeling decoder (PD) [11]. In [16], the 
authors investigate the PD limiting performance by describing 
the expected LDPC graph evolution throughout the decoding 
process by a set of differential equations. The asymptotic per- 
formance for the BP decoder is summarized in the computation 
of the so-called BP threshold [10], [11], [16], [17], which 
defines the limit of its decodable region for an LDPC code. 

The analysis of BP decoding performance in the finite- 
length regime is based on the evaluation of the presence 
of stopping sets (SSs) in the LDPC graph [14], which can 
severely degrade the decoder performance. In [14], [18], the 
authors provide tools to compute the exact BP average perfor- 
mance. However, this task becomes computationally challeng- 
ing if the degree of irregularity or block length increases [11]. 
Alternatively, we can separate the contributions to the error 
rate of large-size errors, which dominate in the waterfall region 
[19], from small failures, which cause error floors [14]. Scaling 
laws (SLs) were proposed in [19], [20] to accurately predict 
the BP performance in the waterfall region. For the BEC, they 
are based on the PD covariance evolution for a given graph 
as a function of the code length. Covariance evolution was 
solved for any LDPC ensemble in [21]. On the other hand, 
the analysis of the error floor is addressed by determining 
the dominant terms of the code weight distribution [14], [18]. 
Precise expressions for the asymptotic bit-MAP and BP error 
floor are derived in [1], [22], [23]. 

Expectation propagation (EP) [24] can be understood as a 
generalization of BP to construct tractable approximations of 
a joint pdf p(V). Consider the set of all possible probability 
distributions in a given exponential family that map over 
the same factor graph. EP minimizes within this family the 
inclusive KuUback-Leibler (KL) divergence [25] with respect 
to piy). In [24], [26], it is shown that BP can be re- 
formulated as EP by considering a discrete family of pdfs 
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that factorizes as the product of single-variable multinomial 
terms, i.e. g(V) = ^1(^1)^2(^2) • • • gn(Ki)- EP generalizes 
BP in two ways: first, it is not restricted to discrete random 
variables. And second, EP naturally formulates to include 
more versatile approximating factorizations [27], [28]. In this 
paper, we focus on EP to construct a Markov tree- structure 
to approximate the original graph. Conditional factors in the 
tree- structure are able to capture pairwise interactions that 
single factors neglect. We refer to this algorithm as tree- 
structure expectation propagation (Tree-EP). We borrow from 
the theoretical framework of the Tree-EP algorithm to design a 
new decoding approach to decode LDPC codes over discrete 
memoryless channels (DMCs) and we analyze the decoder 
performance for the BEC. 

For the erasure channel, we show that the Tree-EP can 
be reinterpreted as a peeling-type algorithm that formulates 
as an improved PD. We refer to this simplified algorithm as 
the TEP decoder. The TEP decoder was presented in [29], 
[30], where we empirically observed a noticeable gain in 
performance compared to BP for both regular and irregular 
LDPC codes. We now explain, analyze and predict this gain 
in performance for any LDPC code. First, we extend to the 
TEP decoder the methodology proposed in [16] to evaluate 
the expected graph evolution of the LDPC's Tanner graph. As 
the block size increases, we show the conditions for which 
the TEP decoder could improve the BP decoder. Nevertheless, 
for typical LDPC ensembles the TEP decoder is not able to 
improve the BP solution. In the second part of the paper, we 
concentrate on practical finite-length codes and we explain 
the gain provided by the TEP decoder compared to BP. 
Based on empirical evidence, we propose a SL to predict 
the TEP performance for any given LDPC ensemble in the 
waterfall region, which captures the gain in performance that 
the TEP achieves with respect to BP for finite-length LDPC 
codes. Furthermore, the SL can be used for TEP-oriented 
finite-length codes optimization. Finally, we also prove that 
the decoder complexity is of the same order than BP, i.e. 
linear in the number of variables, unlike other techniques 
proposed to improve BP at a higher computational cost. For 
instance, we can mention variable guessing algorithms [31], 
the Maxwell decoder [12] and pivoting algorithms for efficient 
Gaussian elimination [32], [33], [34], whose complexity is 
not linear unless we impose additional restrictions that may 
alter/compronuse their performance, such as bounding the 
number of guessed variables or pivots. 

The rest of the paper is organized as follows. Section II 
is devoted to introducing the Tree-EP algorithm for block 
decoding over DMCs. In Section III, we particularize the 
algorithm for the BEC, yielding the TEP decoder. In Sec- 
tion IV, we derive the differential equations that describe the 
decoder behavior for a given LDPC graph and we investigate 
its asymptotic behavior as well as the algorithm's complexity. 
In Section V, we describe the scaling law proposed to approx- 
imate the TEP finite-length performance for a given LDPC 
ensemble in the waterfall region. We conclude the paper in 
Section VI. 



II. Tree-EP for LDPC decoding LDPC over 

MEMORYLESS CHANNELS 

Consider an LDPC binary code C with parity check matrix 
H, of dimensions k x n, where k = n(l — r), n is the code 
length and r the rate of the code. By definition, any vector v 
in ¥2 belongs to the code C as long as vH^ = 0, where ¥2 is 
the n-dimensional binary Galois field. Each row of H therefore 
imposes an even parity constraint on a subset of variables: 



Ci(v) 



^ Vi mod 2 = 



ieij 



Vj = l,...,k; (1) 



where v^ is the i-th component of v, Ij is the set of positions 
where the j-th row of H is one and ![•] is a boolean operator, 
which takes value one if the condition in its argument is 
verified. Note that, given the definition in (1), we can write 
l[vGC] = Ci(v)C2(v)...Ck(v). 

Assume that an unknown codeword is transmitted through 
a discrete memoryless channel [25] and let y G ^(y) be 
the observed channel output, where ^(y) is the channel 
output alphabet. A bit-MAP decoder [35] minimizes the bit 
error rate (BER) by estimating the transmitted vector v = 
[vi, V2, . . . , Vn] as follows^: 

v^ = arg max p{Vu = v\y) = arg inax p (V|y) 

vG{0,1} vG{0,1} t — 



VeC:Vu=v 



g max V p(y|V)p(V) 

n k 

s ^^^x E Xiv{vmX\^^A^) (2) 



for = 1, . . . , n, where we have assumed that the channel is 
memoryless and that all codewords are equally probable 



p(y) 



i[v eC] 



(3) 



For most LDPC codes of interest, the factor graph associated 
to the product p(y|V)p(V) in (2) yields a graph with 
cycles [36]. Hence, the exact computation of the marginals 
of p{Vu = v\y) grows exponentially with the number of 
coded bits [6]. Belief propagation [1], [2], [3] is nowadays the 
standard algorithm to efficiently solve this problem in coding 
applications, because accurate estimates for each marginal are 
obtained at linear cost with n. Besides, BP can be cast as an 
approximation of p(V|y) in (2) by a complete disconnected 
factor graph, i.e. 

n k n 

P(V|y) cx HpiVilVi) n Ci(V) ^HqiMVi}^ (4) 

1=1 j = l 1=1 

where gi,Bp(^i) is the BP estimate for the i-th variable [6], 
[37], [38]. 

^In the following, we use lower case letters to denote a particular realization 
of a random variable or vector, e.g V = v means that V takes the v value. 
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A. Tree-EP algorithm for LDPC decoding 

The Tree-EP algorithm [27], [39] improves BP decoding 
because it approximates the posterior p(V\y) in (2) with a 
tree (or forest) Markov- structure between the variables, i.e.: 



^(V) = ng.m|ypJ, 



(5) 



where the set of pairs {{i,Pi)^=i} forms a tree graph over the 
original set of variables nodes and qi{Vi\Vp.) approximates the 
conditional probability piYilVp^^y). For some variable nodes 
Vp. might be missing, i.e. pi is empty, and we use single- 
variable factors qi{Vi) to approximate them. 

Using the approximation in (5), the complexity of the 
marginalization in (2) is linear with the number of variables. 
In [39], it is shown that the approximation in (5) provides 
more accurate estimates for the marginals of p{Vu\y) than 
BP, since it captures information about the joint marginal of 
pairs of variables that are then propagated through the graph. 
Indeed, note that the factorization in (4) is just a particular 
case of (5), because EP converges to the same solution as the 
BP [24], [26], when we consider a completely disconnected 
graph, i.e. Pi = 9 Vi. 

Consider a family of discrete probability distributions that 
factorize according to (5), denoted hereafter by Ttree- The 
optimal choice for g(V) G J^tree^ denoted by g(V), is such 
that it minimizes the inclusive KL divergence^ with p(V|y): 

g(V) = arg min i^^L (p(V|y)| |^(V)) . (6) 

The next lemma, first proved in [40], states that the reso- 
lution of the problem in (6) is as complex as the bit-MAP 
decoding problem in (2). In both cases we have to perform 
exact marginalization over the posterior distribution p(V|y). 

Lemma 1: (Moment matching for inclusive KL-divergence 
minimization). Consider a set of discrete random variables 
V with joint pdf p(V). Let be a family of probability 
distributions that share a common factorization: 



(7) 



for some normalized functions gi(V^), where is a subset 
of V, for i = 1, . . . , L. Under these conditions, the function 
q{y) in J^* such that 

g(V)=arg ^ min i^^L (p(V)||g(V)) (8) 
is constructed as follows: 



L 



^We refer to the inclusive KL divergence with respect to p when we 
minimize Dkl{p\\q) and to the exclusive KL divergence when we minimize 
Dkl^^Wp), which is the one used for mean-field approximations, see [39] 
for a discussion. 



where V ~ denotes all the variables in V except those in 
V,. 

Proof: The proof of this lemma can be found in [40], 
[41]. ■ 

Lenmia 1 can be directly applied to the problem in (6) and 
the optimum Markov-tree g(V) is such that 

%{Vi\Vj„)= ^ =p{Vi\Vp„y) (11) 



for i = 1, 



,n and 



m = U^i{Vi\Vp,). 



(12) 



The marginal computation in (11) is of the same complexity 
order as (2). Lemma 1 only provides the conditions to find the 
distribution qCV) in the family Ttree that is optimum in the 
sense of (6). A different problem arises from the optimization 
of the family J^tree itself, namely how we choose the parent 
variables Vp., to achieve the highest accuracy for the least 
cost. As discussed in [39], while determining the cost of a 
given choice is straightforward, estimating the accuracy for 
each case is a non-trivial problem that highly depends on the 
distribution p(V) and the application at hand. In our case, 
the analysis of the Tree-EP decoder for the erasure channel in 
Section III provides the intuition to construct the family Ttree 
for any DMC. To describe the implementation of the Tree-EP 
algorithm, in the following we assume fixed the family J-'tree- 
The Tree-EP algorithm overcomes this problem by iter- 
atively approaching g(V) as follows. Define g^(V) as the 
EP approximation to q{V) at the end of the ^-th iteration 
and let q^(V) ^ ^(V) be the Tree-EP solution after con- 
vergence^. Let wlj{Vz,VpJ be non-negative real functions, 
with z = 1, . . . , n and j = 1, . . . , k, that are updated at each 
iteration so that 

n 

Ci(V) « Wj{V) = n wijiV,, VpJ, (13) 

z=l 

Thus, the Tree-EP approximation to the posterior 2?(V|y) at 
iteration i, i.e g^(V), is constructed as follows: 



z=l 



z=l j=l 

The Tree-EP is described in Algorithm 1. At iteration £, 
only the m-th factor, W^CV), is refined. In Step 7 we replace 
^m"HV) by the true value C^(V) in the q^-^{V) function. 
The resulting function is denoted by f(V,i,m). Then, by 



^EP as BP might not converge for loopy graphs [39]. 
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Lemma 1, in Steps 8 to 10 we compute q^iy) as the solution 
of the following problem 

q\\) = 3ig ^ mm i^^L (/(V, ^, m)||g(V)) . (15) 

At this point, the TEP solution becomes suboptimal with 
respect to (6) but tractable: the computation of the marginals 
^^iWvi) for i = 1, . . . , n over /(V, ^, m) in (20) and (21) can 
be performed efficiently. Let us first express the factorization 
of /(V,£,m) in a more convenient way: 



/(V,£,m) = C„(V)- 



n k 

c„,(v)np(2/.ii4) n^/"'(v) 

2=1 j=\ 

n 

z=l 



(16) 



where we have introduced the following auxiliary functions 

9i-'^'^{vM=p{yz\v.) n <-'(^-^pJ- (17) 

Therefore, the marginalization of (16) in (20) yields 

n 

As in (13), the product YTz=i9z~^'^{^z^ypz) niaps over 
the same factor graph than the tree structure chosen in (5). 
Therefore, the presence of cycles in the factor graph of 
/(V,^, m) is due to the parity factor C^(V). The graph is 
cycle-free as long as, among the variables connected to the 
parity check node Cm(V), none of them are linked by a 
conditional term q{Vi\Vp^) in (5), as illustrated in Fig. 1(a). 
Otherwise the graph presents cycles, as shown in Fig. 1(b). 
These cycles play a crucial role in understanding why the 
Tree-EP algorithm outperforms the BP solution. In the first 
case, the marginal computation in (20) is solved at linear cost 
by message-passing. For the latter, where the graph is not 
completely cycle-free, we can compute the pairwise marginals 
using Pearl's cutset conditioning algorithm [27], [42]. 

Pearl's algorithm proceeds by breaking each cycle assuming 
a set of the variables involved as known, e.g. Vq in Fig. 
1(b). Then, the marginals of the remaining variables can 
be computed at low-cost by message-passing. The overall 
complexity of this method is exponential with the number of 
assume-observed variables. However, we prove that for the 
BEC the complexity of the Tree-EP algorithm is linear with 
the number of variables, i.e. of the same order as BP. 

III. Tree-EP decoding for the BEC 
For the BEC, the likelihood function for a particular variable 
p{yi\yi)^ provides a complete description about its value 
when it has not been erased: 



p{yi = l\Vi = l) = l-e, p{yi = l\V, = {))-- 
p{yi = 0\Vi = 1) = 0, pivi = 0\Vi = 1) = 1 



0, 



(23) 
(24) 



Algorithm 1 Tree-EP algorithm for a predefined tree structure. 
1: £ = 0. 

2: Initialize W^{\) = 1 for j = 1, . . . , k. 

n 

3: Initialize q%V) = l[p{yi\Vi). 

i=i 

4: repeat 

5: £ = £^1. 

6: Chose a W^(V) to refine: m = mod(^, k). 
7: Remove W^~^(V) from g^~^(V) and include the true 
term C^(V): 

/(V,^,H = C™(V)^^. (19) 

8: Compute qliVi, V^.) and (V^) for i = 1, . . . ,n: 

ql{Vi,Vp,)^ J2 f{^J,m), (20) 
v~{Vi,ypj 

QKVi) = J2QfiVi^Vp.)- (21) 
jiVAV^A = '^'^Y!:^^:'' for i = 1, . . . ,n. 



9: Compute qiiVi\Vp.) 



10: qKY) = X{ql{Vi\V^,). 

11: Compute wl^{Vu Vj,,) from q^{\). By (14), 

4{Vi\v,,) 



p{yi\Vi)X{w^\vM 



(22) 



for i = 1, . . . , n. 

n 

12: W-i(V) = n<™(Vi,T/pJ. 

13: until all Wjiy), j = 1, . . . , k, converge. 

14: ^(V) ^q^{\). 



In this case, we say that the variable is known. Otherwise, 
when the variable is erased, the likelihood function for this 
variable is constant and the uncertainty about its value is 
complete: 



p{yi = l\Vi = l)=p{y, = l\Vi = ^) = e. 



(25) 



This two-state behavior of the likelihood function makes 
the pairwise marginal functions qfiyi^p.) in (18) and (20) 
alternate between just four states, depending on what we 
know about these variables at the end of the ^-th iteration. 
Furthermore, we can describe a finite set of scenarios for 
which qfiVi.Vp.) might alternate between these states. This 
result is used later to propose a simplified reformulation of the 
decoding algorithm. Let us detail the possible outcomes and 
cases of interest by running the algorithm at different times, 
after initialization. 
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(b) 

Fig. 1. Example of factor graph associated to /(V, ^, m). In (a), the graph 
is cycle free since Cm does not connect any variable Vi with its parent Vp- . 
In (b), the graph has a cycle since both Vr and Vp^ = Vo belong to Im- 



Iteration 1 = 1 . In Step 7, we compute 

n 

f{\,l= l,m = 1) = Ci(V) n^y.!^.), (26) 

Z = l 

where the HkeHhood terms p{yz\Vz) for z = 1, . . . ,n take 
the form (23)- (25). Over (26), we compute the marginals 
of {{Vi.Vp.)'^^^}. For any pair {Vi.Vp.), we observe the 
following scenarios: 

• If Vi and Vp. have not been erased, then marginalization 
yields: 

= l[Vi = vi,Vp,=V2], (27) 
where vi and V2 are, respectively, the values of Vi and 

• If Vi is erased and Vp. is not, then the marginalization 
in (26) might reveal the value of Vi. This scenario only 
happens if Vi is connected to the check Ci(V) and, in 
addition, the rest of variables connected to this check are 
known. In this way, the parity restriction fixes the value 
of the variable and the marginalization yields the same 
result than in (27). Otherwise Vi remains unknown: 

QhVi,Vp,) = h[Vp,=V2]^ (28) 



where V2 is the value of V^.. For instance, assume that, 

in Fig. 1 (a), Vo and Vp are known. When we compute 

the marginal for Vs, this variable gets revealed. 

The case where Vp. is erased and Vi is not is synmietric 

to the previous scenario. 

If both Vi and Vp. are erased, we compute 



QhVi^Vp,)^ ^ ci(v) n piy^\yz) = i/^ 

(29) 

for any (Vi.Vp.) pair unless these two variables are the 
only unknown variables connected to the check Ci(V). In 
this case, due to the parity constraint, only one value of 
Vi + niakes qjiVi^ Vp.) non-zero. In other words, an 
equality or inequality relationship between both variables 
is found and qjiYi^Vp.) either takes the form of 



or 



qHvM = -i[Vi = Vp,] 



qHVi,Vp,) = -i[Vij^Vp,]. 



(30) 



(31) 



For instance, assume that, in Fig. 1(b), Vp and Vg are 
known. Then, we learn that Vr is equal or opposite to 
Vo. 

The latter is of importance to explain the advantage of the 
TEP decoder over BP, because we only obtain the result in (30) 
or (31) if we compute pairwise marginals. The BP algorithm 
uses only a disconnected approximation for the factors and 
from them we cannot derive the (in)equality constraints in 
(30) and (31). 

It can be readily check for the BEC that when we compute 
the functions qi{Vi\Vp.) and wf^{Vi^ Vp.) in Steps 9, 10 and 
11 of the algorithm, they are proportional to the pairwise 
marginal computed above^: 



qf{Vi\Vp,)^qf{Vi,Vp,), 

,m{Vi.Vp,)^q^{Vi,Vp,). 



(34) 
(35) 



Iteration L We follow a induction procedure to analyze 
the result of the ^-th iteration. Given the factorization of 
the function /(V,^, m) in (16), the current information of 
each variable is contained in the ^f~^'^(14, Vp^) functions for 
2; = 1, . . . , n in (17). By replacing (35) into (17), we observe 
that these functions are also proportional to q^~^{yz^Vp^). 
Therefore, 

n 

f{Y,l,m)^Cm{Y)^qi-\V,,Vp^). (36) 

Z=l 

^For instance, if we assume that qf{Vi,Vp^) is of the form of (27), we 
first compute q{Vp^): 

" (32) 



Vi 



and, therefore. 



M^Pi = ^i] 

where we take qf{Vi\Vp,) = when q{Vp,) = 0. 
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By enumerating the possible outcomes of the marginal 
qfiYi^Vp.) over /(V,£, m) in (36), we conclude that the 
discussion for £ = 1 extends for this case almost literally. 
By induction, we have proved that the functions qfiVi^Vp.) 
only belong to one of the four states described in (27)-(31). 

Nevertheless, for any £ > 1, we reveal a new scenario 
for which a variable can be de-erased, thanks to the imposed 
(in)equality pairwise condition in (30) and (31). Assume that 
at iteration £ — 1, we learn that Vr and Vo are equal and at 
iteration £ we process the check node C^(V) depicted Fig. 1 
(b). If Vs = 0, then the erased variable Vp should be zero to 
fulfill the parity constraint. This scenario is not possible if we 
cannot capture equality relationships, which are void for the 
BP decoder. Therefore, the tree structured approximation in 
(5) provides with respect to the BP procedure an extra case in 
which a variable can be de-erased. The key aspect is to find 
(in)equality conditions between variables that can be used to 
decode other variables related to them by parity functions. 
This depends on the family Ttree that it is used to decode the 
received word. 

The Tree-EP procedure for BEC presented next can be 
cast as the sequential search of three particular scenarios to 
perform the inference (e.g. build Ttree), where the only thing 
that matters is the number of unrevealed variables in the 
processed check node. We can further simplify and reformulate 
this procedure as a peeling-type decoder. The key idea is to 
simplify the LDPC Tanner graph according to the information 
that we sequentially obtain from the encoded bits. We rename 
this reformulated algorithm as the TEP decoder [29], [30]. 

A. The TEP decoder 

Before detailing the TEP decoder algorithm, let us introduce 
some basic definitions about Tanner graphs. The Tanner graph 
of an LDPC code is induced by the parity-check matrix H as 
detailed in [9], [35]. The graph has n variable nodes , . . . , K 
and k = ii(l — r) parity check nodes Pi, . . . , Pk. The degree of 
a check node, denoted as dp., is the number of variable nodes 
connected to it in the graph. Similarly, the degree of a variable 
node, denoted as dy. , is the number of check nodes connected 
to that variable node. In the Tanner graph, we associate a zero 
parity value to each check node in the graph. As we iterate, the 
parity of a certain check node Pj, which is denoted hereafter 
as [Pj], might change. 

The TEP decoder is detailed in Algorithm 2. It is based on 
the sequential procedure of degree-one and two check nodes 
in the graph. Processing a degree-one check node in Steps 
10-12 of Algorithm 2 is equivalent to find an erased variable 
connected to a check node where the rest of variables are 
known. Since the BP solution is restricted to this case, the 
description of the BP as a peeling-type algorithm is obtained 
if we do not consider Steps 13-16 in Algorithm 2 [16], 
[43]. In this sense, the TEP decoder emerges as an improved 
PD. Besides, we claim that the complexity of both decoders 
is of the same order, i.e. 0(n). We intentionally leave to 
Section IV-E a detailed analysis of the TEP complexity. 

The removal of a degree-two check node in Steps 13-16 
of Algorithm 2 represents the inference of an equality or 



Algorithm 2 TEP algorithm for LDPC decoding over BEC. 



1: 


Let y be the received codeword: y G {0, ?, 1}^. 


2: 


Construct the index set ^ : Vs G ^ then 7^?. 


3: 


for all s G ^ do 


4: 


Remove from the graph variable node Vg. 


5: 


If i/s = 1, flip the parity of the check nodes connected 




ioVs. 


6: 


end for 


7: 


repeat 


8: 


Look for a check node of degree one or two. 


9: 


if Pj is found with degree one, connected to y^,. 




then 


10: 


Vs is decoded with value [Pj]. 


11: 


Remove both Vs and Pj from the graph. 


12: 


If \Pj\ = 1, flip the parity of the check nodes 




connected to V^. 


13: 


else if Pj is found with degree two, connected to Vq 




and Vr, then 


14: 


Remove Pj and Vq from the graph. 


15: 


If [Pj] = 1, flip the parity of the check nodes 




connected to Vq. 


16: 


Reconnect to Vr the check nodes connected to y^. 


17: 


end if 


18: 


until the graph is empty or there are no degree-one or two 




check nodes in the graph. 



inequality relationship. This process is sketched in Fig. 2. The 
variable Vi heirs the connections of V2 (solid lines) in Fig. 
2(b). Finally, the check Pi and the variable V2 can be removed, 
as shown in Fig. 2(c), because they have no further implication 
in the decoding process. V2 is de-erased once Vi is de-erased. 
Note that, when we remove a check node of degree two, we 
usually create a variable node with a higher degree while the 
degree of the check nodes remain unaltered. 

The TEP decoder eventually creates additional check nodes 
of degree one when we find a scenario equivalent to the one 
depicted in Fig. 3. Consider two variable nodes connected to a 
check node of degree two that also share another check node 
with degree three, as illustrated in Fig. 3(a). After removing 
the check node P3 and the variable node V2, the check node 
P4 is now degree one, as illustrated in Fig. 3(b). At the 
beginning of the decoding algorithm, this scenario is very 
unlikely. However, as we remove variable and check nodes, the 
probability of this event grows, as we are reducing the graph 
and increasing the degree of the remaining variable nodes. 

Another important result of the TEP algorithm is that it 
applies the Tree-EP algorithm with no initial pairwise rela- 
tions. The tree structure is not fixed a priori. It dynamically 
includes a new pairwise relation in the tree structure whenever 
it processes a degree-two check node, i.e. it updates Ttree 
on the fly. In Appendix A, we show that the TEP decoding 
is independent of the ordering in which the variables are 
processed, as different processing orderings yield equivalent 
trees in the graph. 
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(a) 




(c) 

Fig. 2. In (a) we show two variable nodes, Vi and V2, that share a check 
node of degree two, Pi . In (b), Vi heirs the connections of V2 (solid lines). 
In (c), we show the graph once Pi and V2 have been removed. If Pi is parity 
one, the parities of P2, P3 are reversed. 





(a) 



(b) 



Fig. 3. In (a), the variables Vi and V2 are connected to a two-degree check 
node, P3, and they also share a check node of degree three, P4. In (b) we 
show the graph once the TEP has removed P3 and V2. 



B. Connection to previous works 

A similar procedure for removing degree-two clieck 
nodes was considered in the analysis of accumulate-repeat- 
accumulate (ARA) LDPC codes [44], [11]. ARA codes were 
proposed to achieve channel capacity under BP decoding 
at bounded complexity. Roughly speaking, ARA codes are 
formed by the concatenation of an accumulate binary en- 
coder, an irregular LDPC code and another accumulate binary 
encoder. In [45], [46], Pfister and Sason showed that ARA 
codes can be described for BEC as an equivalent irregular 
LDPC ensemble and hence they were able to compute the 
ARA BP threshold using standard techniques [10]. To obtain 
such equivalent LDPC ensemble, parity checks of degree two, 
corresponding to the accumulate encoding, were processed 
similarly to how the TEP decoder processes degree-two check 
nodes. Once they obtained the equivalent irregular LDPC 



ensemble, they just consider BP decoding. Nevertheless, we 
want to emphasize that the novelty of our proposal is twofold: 
first, we consider, describe and measure the effect of the 
removal of degree-two check nodes to improve the BP solution 
for any block code. And second, we have shown that the idea 
of propagating pairwise relationships can be extended for any 
binary DMC using the EP framework. 

IV. TEP DECODER EXPECTED GRAPH EVOLUTION 

Both the PD and the TEP decoder sequentially reduce the 
LDPC Tanner graph by either removing check nodes of degree 
one, or degree one and two. As a consequence, the decoding 
process yields a sequence of residual graphs. In [43], [16], it 
is shown that if we apply the PD to elements of an LDPC 
ensemble, then the sequence of residual graphs follows a 
typical path or expected evolution [11]. The authors described 
this path as the solution of a set of differential equations and 
characterized the typical deviation from it. Their analysis is 
based on a result on the evolution of (martingale) Markov 
processes due to Wormald [47]. 

In this section, we first introduce the Wormald' s theorem 
and then particularize it to compute the expected evolution of 
the residual graphs for the TEP, which is used in the following 
to evaluate the decoder performance. 

A. Wormald 's theorem 

Consider a discrete-time Markov random process Z^^\t) 
with finite d-dimension state space A{ZY that depends on 
some parameter a > 1. Let z\^\t) denote the i-th component 
of for i G {1, ... ,4. Let P be a subset of 

containing those vectors [0, zi, . . . , Zd] such that: 



P 



Zi,l <i <d \ > 0, 



(37) 



We define the stopping time tj^ to be the minimum time so that 
{t/a, Z[''\t)/a, . . . , Z^/\t)/a) ^ V. Furthermore, let fi{-) 
foT 1 < i < d, be functions from M^+^ M such that the 
following conditions are fulfilled: 
1) (Boundedness). There exists a constant O such that for 
Sill i < d and a > 1, 



zt\t) 



(38) 



for all < t < t-D^ 
2) (Trend functions). For sll 1 <i < d and a > 1, 



zf^(t + i) -zf^(t) Z^^\t) 

= f,{r,z[^\t)/a,...,Z^;\t)/a) 



(39) 



for all < t < tT>, where t = |. 
3) (Lipschitz continuity). For each i < d, the function /^(•) 
is Lipschitz continuous on the intersection of V with the 
half space {[t, zi, . . . , Zd] : t > 0}, i.e., if 6, c G M^+^ 
belong to such intersection, then there exists a constant 
u such that 

\Mb)-Mc)\<,,J2\^,-cj\. (40) 
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Under these conditions, the following holds: 
• For [0, 61, ... , bd] G V, the system of differential equa- 
tions 



dr 



has a unique solution in V for Zi{T) : 
^.(0) = bi,l<i<d. 
• There exists a constant S such that 



1, . . . 



(41) 
with 



p{\zt\t)/a- Zi{t/a)\ >Sa-^^ < dai exp{-a^ /2), 

(42) 

for i = 1, . . . , d and for < t < tx>, where Zi{T) is the 
solution given by equation (41) for 



z,{0)=E[Z^^^\t = 0)/a]. 



(43) 



The result in (42) states that each realization of the process 
z\^\t) has a deviation from the solution of (41) smaller than 
(9(a~^/^) if a is large enough. Our goal is to show that this 
theorem is suitable to describe the LDPC graph evolution 
during the TEP decoding process in certain scenarios. 

B. LDPC ensembles and residual graphs 

In this subsection, we introduce some basic notation 
about LDPC ensembles. An ensemble of codes, denoted by 
LDPC[A(x), p(x), n], is defined by the code length n and the 
edge degree distribution (DD) pair (A(x),p(x)) [10]: 



A(x) = £a,x^-\ 



i=2 

^max 



(44) 



(45) 



i=2 



where A^ represents the fraction of edges with left degree i in 
the graph and pj is the fraction of edges with right degree j. 
The left (right) degree of an edge is the degree of the variable 
(check) node it is connected to. The graph is specified in 
terms of fractions of edges, and not nodes, of each degree; this 
form is more convenient to analyze the convergence properties 
of the decoder. If the total number of edges in the graph is 
denoted by E, it can be readily checked that 

^=xArr' ^46) 

The design rate of the LDPC ensemble r = r(A(x),p(x)) 
is set as follows [11]: 



e 



(47) 



avg 



where Aavg and ©avg are, respectively, the average variable 
degree and the average check node degree in the graph. They 
can be computed from the graph DD: 

^ (48) 



^avg 



€^avg — 



/o A(^)d^' 
1 



(49) 



To analyze the expected graph evolution, each time step 
corresponds to each step of the decoder. Li{t) and Rj{t) 
are, respectively, the number of edges with left degree i 
and right degree j in the residual graph at time t and we 
define kit) = Li(t)/E and ri{t) = Ri{t)/E. We denote by 
E{t) the number of edges in the graph at time t and define 
e{t) = E{t)/E. Hence, li{t)/e{t) for i = 1, . . . , /niax(^) and 
rj{t)/e{t) for j = 1, . . . , r^ax are the coefficients of the DD 
pair that defines the graph at time t. As we show in the next 
subsection, an small fraction of degree-one variable nodes 
might appear during the decoding process. Note that we have 
included an explicit dependency with time in /max(^)- As we 
described in Section III-A, the removal of degree-two check 
nodes tends to increase the variable degree and, consequently, 
we expect the maximum left degree to grow. 

The remaining graph at time t+1 only depends on the graph 
at time t and, hence, the sequence of graphs {L{t)^ along 
the time is a discrete time Markov process, where L{t) = 
{Li{t)y^^^^\ R{t) = {Rj{t)Yj'^\. It can be shown that the 
DD sequence of the residual graphs constitutes a sufficient 
statistic [16], [19] for this Markov process and, therefore, it 
suffices to analyze their evolution. 

For a given LDPC ensemble and a BEC with parameter e, 
the TEP decoder performance is analyzed and predicted using 
the expected evolution of ri{t) and r2{t) along time. In this 
section, we first identify their dependence with the rest of 
the components of the DD in the graph. Then, we show the 
conditions in which they can be estimated along time using 
Worlmald's theorem. 



C. Expected graph evolution in one TEP iteration 

We analyze the average evolution of the DD pair 
{L{t)^R{t)) in one iteration of the TEP decoder, i.e. 



E[L(t^l)-L{t)\L{t),R{t)], 
E[R{t^l)-R{t)\L{t),R{t)]. 



(50) 
(51) 



At time t, the TEP looks for a check node of degree one or 
two to remove it. With probability 



Pc{t) = 



Ri{t) 



Ri{t)^R2(t)/2 



(52) 



the decoder selects a check node of degree one, which is 
denoted as Scenario Si. Alternatively, in Scenario S2, a check 
node of degree two is removed with probability pc{t) = 
1 —Pc{t). The expected change in the process {L{t),R{t)) 
between t and t-\-l can be expressed as follows: 

E\Li{t ^ 1) - Li{t) L{t),R{t)\ = (53) 
PC {t) E\Li{t^l)-Li {t) L{t),R {t) , Si 



p^{t)E Li{t^l)-Li{t) L{t),R{t),S2 



II 
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(a) 



(b) 



Fig. 4. Examples of Subscenarios in (a), given the check node Pj the 
variable nodes Vo and Vr only share the check node Pj, and <S^ in (b), in 
which they share another check node of degree dp^. 



for i = 1, . . . ,lmax{t) and 



E 



Rj{t^l) -Rj(t) L(t),R(t) 



PC WE Rj{t + l)-Rj{t) L{t),R{t),Si 



(54) 



/// 



^pd{t)E Rj{t^l)-Rj{t) L{t),R{t),S2 



IV 

for j = l,...,rniax- In the following, we omit the pair 
{L(t), R{t)} in the expectations to keep the notation unclut- 
tered. 

The terms / in (53) and /// in (54) correspond to one 
iteration of the BP decoder and they were already computed 
in [16]. We include them for completeness: 



E^L,{t ^ 1) - L,{t) 

for Z = 1, . . . ,/max(t), 



Si 



e(t) 



E 



Rj{t + 1)-Rj{t) 



Si 



(55) 



(56) 



e(t) e(t) 



(/avg(t)-l)-^(j-l) 



for j = 1,. 
function and 



X, where 6{j) is the Kronecker's delta 



^max (^) 

(t)= J2 ih{t)/e{t) 



(57) 



i=l 



is the average edge left degree at time t. We now compute the 
terms // and IV in (53) and (54), respectively. When a check 
node of degree two is removed, e.g., Pj in Fig. 4, there are 
two possible subscenarios: 

• ^2^: The variable nodes Vq and Vr connected with the 
check node Pj do not share another check node, depicted 
in Fig. 4(a). 

• S2' The variable nodes Vo and Vr connected with the 
check node Pj share at least another check node, depicted 
in Fig. 4(b). 

Let psit) be the probability of scenario 5^, i.e., psit) = 
p (5<f 152). In Appendix B, we show that this probability is 
given by: 

(Zavg(*) - l)'(ravg(t) - 1) 



PB{t) = 



e{t)E 



(58) 



where 

^max 

r..,{t) = ^jrj{t)/e{t) (59) 

is the average edge right degree. In Appendix B, we also show 
that the scenario S2 is dominated by the case in which the two 
variables connected to a check node of degree two only share 
another check node, as illustrated in Fig. 4(b). To compute // 
and IV in (53) and (54), we first evaluate the graph expected 
change for S2 and S2 and then average them using psit)'- 



E[L,(t + 1) - L,{t)\S2] =PBm [Liit^l) - 
+ p^(t)E [L,{t + l)-L,{t)\S^], 

E[Rj{t + l)-Rj{t)\S2] =PB{t)E [Rj{t^l) 
^p^m [Rj(t^l)-Rj{t)\S^], 



■m\si] 

(60) 

-RMSi] 
(61) 



where psit) = l—pB(t). Now we evaluate each term in (60) 
and (61). 

1) Expected change in the graph assuming S2- At time t, 
we remove the check node Pj in Fig. 4(a), which is connected 
to Vo and Vr. If Vo is the remaining variable, its degree 
becomes dy^ + dy^ — 2. From the edge perspective, the graph 
losses dv^ edges with left degree dy^ and dy^ edges with left 
degree dy^, and gains dy^ + dy^ — 2 edges with left degree 
dvo + dy^ ~ 2. Note also that we have the same result if Vq 
is the remaining variable. The node degrees dy^ and dy^ are 
asymptotically pairwise independent [48] and, thus, Vo and Vr 
are degree i with probability /i(t)/e(t) for i = 1, . . . , /max(^)- 

We first focus on the evolution in the number of edges with 
left degree i > 3 at time t + 1, i.e. Li{t -\- 1). The sample 
space of Li{t + 1) — Li{t) is given by: 

Li{t + 1) - Li{t) e {-i, -2i, +i, 0} i > 3, (62) 

where 

. Li(t + 1) - Li{t) = -i, if {dy^ = i XOR dy^ = i) AND 

dy, ^dy^-2^i. 
. Li{t + 1) - Li{t) = -2z, if dy^ = dy^ = i. 
. Lilt + 1) - Li{t) = +i, if dy^ ^ i, dy^ ^ i AND dy^ + 

dy^ — 2 = i. 
• Li{t-\-l) — Li{t) = 0, otherwise. 

The probability associated to each case can be easily eval- 
uated. Finally, the expected change in the number of edges 
with left degree yields: 



Li{t ^ 1) - Li{t) 



kit) hit) 



9=1 



e{t) e{t) 

lq{t) l(i-q)+2{t) 

e{t) e{t) 



2i 



kit) 
e(i) 



(63) 



For i = 2, the sample space reduces to L2(i + 1) — L2{t) € 
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{—2, +2, 0} and, it is straightforward to show that: 



E 



L2it + 1) - L2{t) 



e{t) 
For the case i 



e{t) 
1, we have 



-2 



h{t) 
e{t) 



+ 2 



hit) h{t) \ 
e{t) e{t) J • 
(64) 



E 



Li{t + 1) 

Mi) 



Li{t) 



hit) 



hit) 



(65) 



e(t) V J V e(0 

Note that degree-one variable nodes are not created in 
both scenarios Si and 82- Regarding the edge right degree 
distribution, only two edges of right degree two are lost and, 
hence. 



E 



R2it ^ 1) - R2it) 
R^{t^l)-R,it) 



0, j ^ 2. 



(66) 
(67) 



2) Expected graph evolution assuming 5^; We study now 
the scenario depicted at Fig. 4(b), where the variables Vq and 
Vr are also linked to another check node Pi of degree dp^^ . In 
this case, the degree of the remaining variable is dy^ + dy^ — 4 
and the check node Pi losses two edges and its degree reduces 
to dp^ — 2. On the left side, the graph losses dy^ edges with 
left degree dy^ and dy^ edges with left degree dy^, and gains 
dvo + dy^ — 4 edges with left degree dy^ + dy^ — 4. 

For a given degree i 7^ 4, L^(t + 1) — Liit) takes value in 
the set in (62). The possible combinations of dy^ , dy^ and the 
associated L^it + 1) — L^it) values are now as follows: 

• Liit-\-l) - Liit) = -i, if idy^ = i XOR dy^ = i) AND 
dy^ + dy^ — 4 ^ i. 

. Liit + 1) - Liit) = -2i, if dy^ = dy^ = I. 
. Liit + 1) - Liit) = +i, if dy^ ^ i, dy^ ^ i AND dy^ + 
dy^-4. = i. 

• Liit -\- 1) — Liit) = 0, otherwise. 

The expected value of I/^(t + 1) — L^it) for i 7^ 4 is given 



by: 



E 



Li{t + 1) - Li{t) 
hit) 



-.^2fl-^-^ 
eit) \ e{t) e{t) 

i+3 



2i 



kit) 
eit) 



Iqit) l{i-q)+Ait) 

eit) eit) ' 



(68) 



and, for the case i = 4 we obtain: 



E 



L4(t + 1) -L4(t) 



kit) 

e{t) 



kit) 
eit) 



i+3 



+4 E 



q=l 



/g(t) ^(i-g)+4(^) 

e(t) eit) 



(69) 



The values of i for which the number of edges involved in 
(63) and (68) is larger than the number of edges in the graph 
are not allowed. We set a zero probability for them. We do 
not enumerate the complete list of these combinations for the 
sake of the readability of the section. 

The importance of scenario S2 lies on the fact that the 
check node Pi losses two edges and its degree reduces to 
dpi — 2. Therefore, check nodes with right degree one can 
be created. Since the check node has degree dp^ = I with 
probability r/(t)/e(t), it can be shown that: 



E Rjit^l) - Rjit) 
for j > 2, 
E[i?2(t+1) 



5f 



and 



E 



R2it) 



Rlit ^ 1) - Rlit) 



rj+2it) _ rji^ 
eit) eit) 



r^it) r2(t) 
eit) eit) 



rsit) 
eit)' 



(70) 



2, (71) 



(72) 



With the results in (63)-(72) and the probability psit) in 
(58), we are able to compute the terms // and IV in (53) and 
(54), obtaining the expected graph evolution in one iteration 
of the TEP decoder. It is important for the following analysis 
to note that, in any possible scenario, Riit) and i?2(^) only 
depend on the left DD through l^vgit). 

D. Analysis of lav git) in the asymptotic limit 

The application of Wormald's Theorem to guarantee the 
concentration around the TEP expected graph evolution de- 
rived in the previous subsection is not formally possible in 
the limit n ^ 00. First, the maximum left degree /max(^) is 
not bounded and the boundedness condition in (38) might not 
hold. And second, note that the dimension of the Markov 
process is not bounded either. However, the 

asymptotic limit performance can be studied by observing the 
evolution of the average left degree hvgit), which measures 
how likely is the creation of degree-one check nodes by 
removing degree-two check nodes, see Fig. 3. 

If we assume an LDPC[A(x), p(x), n] ensemble with 
bounded complexity, i.e. finite Aavg, ^max and rmax values, then 
in the limit n oc we get pb(^ = 0) = 0. As long 
as psit) = 0, the TEP and BP solutions are equivalent. 
Therefore, the BP decoding threshold is only improved for 
those LDPC ensembles for which, as the TEP decoder runs, 
there exists some to < tx> such that 

lim lim psit) 

t— )-to n— )-oo 

t^ton^oo e{t)Jl 

or, equivalently, if there exists some to < tx, such that 

lim /avg(^) = 00, (74) 

t^to 

since the rest of the terms in (73) stay bounded during the 
TEP procedure if e > eBp. If ^avg(^) becomes infinite, the 
asymptotic decoding threshold for the TEP might be higher 
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Fig. 5. In Fig. 5(a), we plot the average left degree lavg{t) computed using the urn model for = 100 (A), 1000 (x), 10^ (o), 10^ (□), as a function 
of the fraction of processed variables. In Fig. 5(b) we plot the ratio between /avg(^) for = lO'^^-'^ and = lO'^ for j = 2 (x), 3 (o) and 4 (□). 



than the decoding threshold of the BP decoder. However, we 
cannot rely on Wormald's Theorem to find the TEP threshold 
exEP- In this section, we analyze the conditions for /avg(^) to 
go to infinity, which opens the possibility for exE? > ^bp- 
Although, we have not found so far any LDPC ensemble 
with a finite degree distribution meeting these conditions and, 
besides, the strategies followed to maximize this effect yield 
ensembles that lack practical interest, as shown later. We leave 
as an interesting open problem the search for practical LDPC 
ensembles for which exE? > cbp and, as well as, the exact 
computation of such threshold. 

Assume an LDPC ensemble of infinite length. As proven 
in Appendix A, the processing order is irrelevant and, hence, 
we can always run the BP first. The removal of degree-one 
check nodes does not increase the left average degree, i.e. 
/avg(^) is finite [11]. Over the BP residual graph, processing 
a degree-two check node can be explained with a Polya urn 
model. Consider /max(^) urns labeled 1,2,..., /max(^). In urn 
i we place a ball for each variable of degree i in the graph. 
In each iteration, we take two balls from the urns. The urns 
are chosen independently with probability proportional to the 
number of balls per urn times its label. One ball is thrown 
away and the other one is placed in the urn labeled by the 
sum of the labels of the picked balls minus 2, introducing a 
new urn if it did not previously exist. For example, if we pick 
a ball with label "3" and a ball with label "4", we put one ball 
in the urn labeled "5". We repeat this process C2 times, the 
number of degree-two check nodes in the BP residual graph. 
The resulting /avg(^) is the sum of the number of balls in all 
urns multiplied by their labels. 

It is straightforward to conclude that, as we process check 
nodes of degree two, /avg(^) increases, but does it becomes 
infinite? On the one hand, if we start with balls and C2 = 

— 1, /avg(^) becomes infinite as ^ oo. On the other hand. 



if C2 is small enough such that we do not pick twice the same 
ball, /avg(^) stays finite. There must be an intermediate value 
for C2 for which /avg(^) becomes infinite. 

Let us illustrate this result with the (3,6) regular LDPC 
code. We run the urn model described above to numerically 
compute /avg(^)- In Fig. 5(a), we plot /avg(^) for = 10^ 
(A), 10^ (x), 10"^ (o), 10^ (□), as a function of the fraction 
of processed variables C2/n^ In Fig. 5(b) we plot the ratio 
between /avg(^) for = 10^+^ and = 10^ for j = 2 
(x), 3 (o) and 4 (□). Although the urn model is only valid 
asymptotically and as long as psit) = 0, we can see there 
is a clear phase-transition. If we process less than 74% of 
the variables nodes we should not expect /avg(^) to go to 
infinity. For a BEC channel with erasure probability just above 
the BP threshold (e = 0.4295), the ratio between degree-two 
check nodes and variables in the BP residual graph is 0.625 
[16]. Therefore, for this ensemble the TEP decoder performs 
asymptotically as the BP decoder, i.e. exEP = ^bp, because 
/avg(^) Stays finite. 

In order to maximize the fraction C2 with respect to the 
number of variables in the BP residual graph there are two 
basic strategies. We can either design LDPC ensembles to 
minimize the size of the residual BP graph or we can design 
the ensemble to increase the presence of degree-two check 
nodes. Given the known analytical expressions for the BP 
residual graph [16], it is easy to prove that in the first case 
the solution yields standard irregular ensembles for which 
^BP cmap, i-e. codes for which the threshold is given by 
the stability condition [15], [16], [22]. In this case, there is 
no margin and eBp = cxep = ^map- In the second case, the 
ensemble presents severe limitations: for instance, by (47), 
any ensemble where all the check nodes are of degree two. 



IEEE TRANSACTIONS ON INFORMATION THEORY 



12 



i.e. p{x) = X, has a rate 

r{X{x),x) = 1 



/o AMd^ 



G [0, 1] X{x) = a-\-bx, 

(75) 



where a, 6 G M are such that a -\- b = 1. Either the code 
presents minimum distance of only one bit if a > or zero 
rate if a = 0. We approach this undesired behavior as we 
increase the fraction of degree-two check nodes p2. 

E. TEP decoder complexity 

In the next two lemmas we prove that, for most LDPC 
codes, namely those for which exEP = crp, the TEP complexity 
linearly scales with the code length n: 

Lemma 2: Consider a transmission over a BEC of pa- 
rameter e using a code C sampled at random from 
LDPC[A(x), n], where the polynomials \{x) and p{x) 
are of finite order and the code length n oo. Let 
Aavg(t, y,C) and ©avgl^^y^C) be the evolution under TEP 
decoding of the average variable and check node degrees when 
the channel realization is y. For any vector y, Aavg(t, y, C) and 
©avg(t,y,C) are bounded during the whole decoding process. 
Proof: See Appendix C. ■ 

The boundedness property of the variable degree evolution 
under TEP decoding, proved in Lenmia 2, has a significant 
impact in the decoding complexity. 

Lemma 3: The complexity per iteration of the TEP algo- 
rithm for decoding LDPC codes of positive rate and finite 
maximum variable and check node degrees remains constant 
for any code for which ejEP = cb?. Compared to the BP 
complexity, the TEP complexity differs at most in a constant 
complexity per iteration. 

The complexity per iteration can grow as a function of n 
only for e > eep and for those ensembles for which the limiting 
condition in (73) is fulfilled. 

Proof: See Appendix D. ■ 

E TEP decoder differential equations 

Unlike the asymptotic case, for finite-length LDPC ensem- 
bles such that ejEP = ^bp, Wormald's Theorem can be applied 
to estimate the expected graph evolution along the decoding 
process. In particular, we are interested in the evolution of 
ri(t) and r2(t), which is basic to predict the finite-length 
performance as we later discuss in Section V. Consider the 
TEP decoding of an arbitrarily large LDPC[A(x), n] 
ensemble with finite code length. First, note that expressions 
(53) and (54) play the role of the trend functions in Wormald's 
theorem: 



R,{t^l)-Rj(t) 



E 



E ' E ' ' 
Li{t^l)-Li{t) 

yE' E ' E ' E ) 



E 



(76) 



(77) 



for z = 1, . . . , ^ and j = 1, . . . , rmax, where L^yg{t) = 
iLi{t). Regarding the bounding condition in (38), for finite 
ensembles, any individual realization \Rj{t -\- 1) — Rj{t)\ and 
\Li{t-\-l)—Li{t) I are bounded by E at any time. This condition 
also ensures the Lipschitz continuity of the functions fLi{b) 
and fRj{b) in (76) and (77) for all b contained in the set 
V defined in (37). In this case, V is simply the region of 
possible initial conditions for the DD, i.e. the region within 
the hypercube of unit length and dimension E + rmax- 

Given the former discussion, Wormald's Theorem ensures 
that when we solve the differential equations: 



dr 

dr 

dliir) ^ 
dr 



Irj (r, /avg(r), ri(r), . . . , r^^^(r)) , 



(78) 



= E 'f^i ^2 (r ),..., (r) , n (t) , r2 (r) ) , (79) 

i 



(80) 



for j = 1, . . . , rmax and i = with initial conditions 

/i(0) = E[Li(0)]/^ and rj(0) = E[i?j(0)]/^, then the 
solution for ri(r) and r2(r) is unique and, with high proba- 
bility, by (42), does not differ more than 0{E~^^^) from any 
particular realization of Rj{t)/E for j = 1 and 2. Because of 
the expected graph evolution equations derived in Section IV-C 
assumed independency, the deviation that we can expect with 
respect to the solution given by (78)-(80) might rise up to 
0{E~^^^ + E~^). As we show in Appendix B, any cycle 
involving m variables decays as E~^. The most dominant 
component, i.e. E~^, is not significant compared to maximum 
deviation guaranteed by Wormald's theorem. 

Note that the initial conditions, /^(O) and rj(0), contain the 
information about the ensemble LDPC[A(x), p(x), n] and the 
channel [16], [11]: 

/^(O) = eXi, i < Ux, (81) 

m>j ^ ^ 

(82) 



It is important to remark that, as described in Section IV- A, 
the solution of (78)-(79) only holds for all t < ti^. In our 
scenario, the stopping time tx> is given by either the time 
instant at which the decoder stops because there are no degree- 
one or degree-two check nodes or the time when e(r) cancels, 
denoting that all variables in the graph have been decoded. 

Let us illustrate the accuracy of this model to analyze 
the TEP decoder properties for a very large code length, 
n = 2^*^. In Fig. 6(a), we compare the solution of the system 
of differential equations in (78) and (79) for Ri{r) = ri{T)E 
and R2{t) = r2{r)E for a regular (3,6) code with 20 par- 
ticular decoding trajectories obtained through simulation. We 
consider two cases: below (e = 0.415) and above (e = 0.43) 
the BP threshold, i.e. eBp = 0.4294. We depict their evolution 
along the residual graph at each time, i.e. e{r). We plot in thick 
lines the solution of our model for Ri{r) (<) and R2{t) (o), 
and in thin lines the simulated trajectories, Ri{r) in solid and 
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Fig. 6. In (a), we include the predicted evolution of Ri(t) (<) and R2(t) (0) for a regular (3,6) code with n = 2-^^ and an erasure probability 
e = 0.415 < cbp (top) and e = 0.43 > cbp (bottom), where cbp = 0.4294. In (b), we include the same result for the irregular DD defined in (83) and 
(84) and an erasure probability e = 0.47 < cbp (top) and e = 0.483 > cbp (bottom), where cbp = 0.4828. We include in thin lines a set of 20 individual 
decoding trajectories chosen at random. 



R2{t) in dashed line. In Fig. 6(b), we reproduce the same 
experiment for the following irregular LDPC ensemble, 

A(x) = ix+^x^ (83) 
6 6 

p{x)=x\ (84) 

where the BP threshold for this code is eBp = 0.4828. For this 
code, by running the urn model procedure described previously 
in Section IV-D, we also find that /avg(^) does not scale with 
n. 

In both cases, when we are below the BP threshold both 
the expected evolution curves and the empirical trajectories 
represent successful decoding since degree-one check nodes 
do not vanish until e(r) tends to zero. Besides, note also that 
the longest deviation happens around r ^ 0.12 and r ^ 0.18, 
when the predicted curves for both Ri{r) and R2{t) have a 
relative minimum. This point is known as critical point and 



plays a fundamental role in the derivation of scaling laws to 
predict the performance in the finite-length regime [19], [11], 
as explained in Section V. In [19], [20], the authors show that 
the graph initial DD are Gaussian distributed around the mean 
in (81) and (82). Furthermore, they observe that as the PD 
performs, individual realizations are also Gaussian distributed 
around the mean computed in [16] using Wormald's theorem. 
And they show that the standard deviation is 0{l/\^), lower 
than the one warranted by Wormald's theorem. This result is 
a consequence of the channel properties. In Fig. 6 we observe 
the same results for the TEP decoder. 
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V. TEP DECODING OF LDPC ENSEMBLES IN THE 
FINITE-LENGTH REGIME 

A. Motivation 

In [29], [30], we empirically observed the gain in perfor- 
mance obtained when the TEP decoder is applied to decode 
some finite-length LDPC ensembles over the BEC. The gain 
of the proposed algorithm for practical codes can be analyzed 
from a different perspective based on the work by Polyanskiy 
et ah [49]. They present bounds on the maximum achievable 
coding rate for binary memoryless channels in the finite-length 
regime. These bounds can be regarded as the extension of 
the Shannon coding rate limit when the number of channel 
uses, i.e. code length, is fixed. For a given channel, a target 
probability of error Pw and a desired code rate r, we can 
compute the minimum code length iimin for which there exists 
a code that satisfies these requirements. 
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Fig. 7. Dependency-testing (DT) lower bound to the maximal achievable 
rate with supreme block error rate iV < 10~^ over a BEC of parameter 
e = 0.5 [49]. We include the minimum code length computed to obtain a 
valid performance for both the TEP and the BP and two LDPC ensembles A 
and B of rates = 0.325 and = 0.365. 

In Fig. 7, we plot the dependency-testing (DT) lower bound 
in [49] to the maximal achievable rate for a target block error 
rate of Pw = 10~^ over a BEC of parameter e = 0.5. In [49], 
this lower bound is shown to be tight and much simpler to 
compute than the non-asymptotic maximum achievable rate. 
We considered two LDPC ensembles, referred to as code A 
and B. In both codes, \{x) = x^, while Pb{x) = 0.25x^ + 
0.75x^ and Pa{x) = O.bx^ + O.bx^. The rate of each code 
is, respectively, ta = 0.325 and tb = 0.365. In Fig. 7, we 
depict the minimum code length for both the TEP and the BP 
decoders to empirically obtain a performance below 10~^ for 
each one of the two LDPC ensembles proposed. We observe 
that the use of the TEP decoder reduces by roughly half the 
block length to the optimum case given by the DT bound. 
Since the complexity of both algorithms is of the same order 



for these ensembles, the TEP decoder emerges as a powerful 
method to decode practical finite-length LDPC codes. 

In the light of these results, we focus on a theoretical 
description of the TEP gain with respect to BP for finite- 
length codes. We show that the TEP differential equations 
proposed in Section IV-F are the key to measure and predict 
such gain. This result is used to extend to the TEP decoder 
the closed-form expressions proposed in [19], [20] to estimate 
the BP performance for some LDPC ensembles of regular or 
quasi-regular DD. They are referred to as BP scaling laws 
(SLs). For the TEP, we propose a simple SL, in which all 
parameters are analytically known as a function of the DD. 
We start by reviewing some important steps in the analysis of 
the BP performance for finite-length LDPC ensembles that are 
need for the TEP finite-length analysis. 

B. BP decoder in the finite-length regime 

In [19], [20], the authors proved that the BP performance 
for finite-length LDPC codes can be predicted by analyzing 
the statistical presence of degree-one check nodes at a finite 
set of time instants during the whole decoding process. These 
points are referred to as critical points. 

Definition 1: BP-critical point of an LDPC ensemble. For 
a given LDPC ensemble with BP threshold 6bp? 

let rf (r) 

be the expected evolution of the fraction of degree-one check 
nodes under BP decoding at e in the limit n ^ oo. We say 
that r* is a BP-critical point of the ensemble if 



lim 



rf(7 



^) = 0. 



(85) 



In [16], [43], the authors analytically compute r^^(r) as 
a function of the LDPC degree distribution and the channel 
parameter e: 



rf^{u) = e\{u) U - 1 + p (1 - e\{u)) 



(86) 



where ^ = The decoding process starts at = 1 and 
finishes at = 0. Let X^^^(r, n) be the random process that 
represents the evolution of the fraction of degree-one check 
nodes along the decoding process for a given ensemble of code 
length n. Note that any realization x^^ (r, n) of such process 
represents a successful decoding as long as it is positive for any 
r G [0,n/£^). The process X^^^(r, n) presents some important 
properties [19], [20]: 

1) E[X^P(r,n)] closely follows r?P(r) in (86). For 
moderately-sized codes the mean of the process is essen- 
tially independent of n. 

2) The variance is of order (9(n~^). We denote it as 

3) For any r, the distribution of X^^^(r, n) tends with n to 
a (truncated) Gaussian pdf. 

Let us focus for simplicity on LDPC ensembles with a single 
critical point at r*^ In [19], [20], it is shown that the BP 
decoder at e = eep+Ae successes with very high probability as 

^This includes any regular ensemble and typically codes with small degree 
of irregularity, for instance the code defined in (83) and (84). 
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Fig. 8. In (a) we show a set of 100 realizations of X^^{t, n) for n = 2^^ 
and e = 0.425 along with the DE mean prediction in (86), thick line. 



long as X^^^(r*,n) > 0, i.e., if there exists a positive fraction 
of degree-one check nodes at the BP-critical point. In Fig. 8, 
we include a random set of 100 realizations of Xf^{r^n) for 
the (3, 6) LDPC ensemble, n = 2^^ and e = 0.425 along with 
the Density Evolution (DE) mean prediction in (86), in thick 
line. The ensemble has a single critical point at r* = 0.07. 
Around this point, we can see that the DE curve reaches a 
relative minimum and the global error probability is clearly 
dominated by the failure probability at this point. Thus, the 
finite-length performance can be roughly estimated by just 
evaluating the cumulative density function (cdf) of X^^ (r, n) 
at the unique critical point r*: 



1 -E 



(A(x),p(x),n) 



[P^^C, e)] < p {X^^iT\n) > 0) , (87) 



where P^{C^ e) is the average block error probability for the 
code C G LDPC[A(x), p(x), n] under BP decoding. To evaluate 
(87) assuming X^^^(r*,ii) is Gaussianly distributed, we need 
its mean and variance. The mean can be computed for each 
e from (86), but in [19] the authors show that the first-order 
Taylor expansion around the BP-critical point, i.e: 



.BP( 



r*) = 



de 



Ae, 



(88) 



is as precise and more convenient, as it allows computing a SL 
that measures the performance as a function of the distance to 
the threshold. 

Closed- form expression for the variance, 5^1^^ l'^*)' ^^Y 
LDPC ensemble can be found in [21] and, for large enough 
n at the critical point, we can ignore the dependency of 
^rr,ri('^*) with respect to e [19]. The error probability can be 
approximated by [19]: 



E 



(A(x),p(x) 



V^(eBP - e) 



(89) 



"BP 



drf{T) 
de 



(90) 



For sufficiently large code lengths, this scaling function pro- 
vides an accurate estimation of the BP error probability. A 
comparison between (89) (dashed lines) and empirical BP 
performance curves (solid lines) can be found, respectively, 
in Fig. 11(a) and (b) for the regular (3,6) LDPC ensemble 
and for the irregular LDPC ensemble in (83) and (84). 

C TEP decoding in the finite-length regime 

For describing the TEP decoder finite-length performance, 
we follow a similar approach and study the random process 
Xj^^^(r, n), which represents the evolution of the fraction of 
edges with right degree 1 along the TEP decoding process for 
a given ensemble of code length n, around its local minima^. 
As for the BP, the analysis centers on the points during 
the decoding processes for which the presence of degree- 
one check nodes vanishes. As for the BP analysis, we define 
similarly the critical points for the TEP decoder and we also 
prove that these critical points are identical for both decoders. 

Definition 2: TEP-critical point of an LDPC ensemble. For 
a given LDPC ensemble with TEP threshold ctep — ^bp? 

let 

r|^^(r) be the expected evolution of the fraction of degree- 
one check nodes under TEP decoding at e in the limit n ^ oo. 
We say that r' is a TEP-critical point of the LDPC ensemble 
if 



lim 



(92) 



Lemma 4: For a given ensemble LDPC[A(x), p(x)], the 
number of TEP critical points is equal to the number of BP 
critical points. 

Proof: In the regime n oo, we have proven that, 
with very high probability, the TEP decoder is not able to 
create any additional check nodes of degree one with respect 
to the BP solution and, as a consequence, ejEP = ^Bp. Besides, 
given the TEP independence of the processing order proved 
in Appendix A, we can implement the TEP decoder by first 
removing all the degree-one check nodes, i.e. a BP stage, 
and then process degree-two check nodes if the BP does 
not succeed. Hence, for e eep" and n ^ oo, ri^(r) 
and r|^^(r) match during the whole decoding process and, 
therefore, the ensemble presents the same number of critical 
points for both decoding algorithms. ■ 



^We could have also studied the evolution of 

Xll\ (r, n) = Xlf{r, n) + X^f (r, n), 



(91) 



where Xj^^ (r, n) represents the evolution of the fraction of edges with right 
degree j along the decoding process for a given ensemble of code length 
n. But the processing of degree two does not decode any additional variable 
unless degree-one check nodes are create and hence we only focus on the 
evolution of the random process Xj^^ (r, n) that tell us how many additional 
variables we can decode. 
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Fig. 9. For the (3,6) ensemble, we plot ri (r^, n, ctep) x n (solid lines) 
computed from (78)-(80) for different code lengths, where ctep = 0.4294. 
We include in dashed lines sample average curves obtained by simulation. We 
have averaged them over 500 realizations. 



The process (r, n) behaves as X^^ (r, n) and present 
the same statistical properties, namely: 

1) E[Xj^^^(r, n)] closely follows r|^^(r), which is com- 
puted as solution of the TEP decoder differential equa- 
tions in (78)-(80). 

2) The variance decays as (9(n~^). We denote it as 
<^Jf;,(T)/n. 

3) For any r, the distribution of Xj^^^(r, n) tends with n to 
a (truncated) Gaussian pdf. 

As discussed before, for codes with a single critical point r^ 
we estimate the TEP finite-length performance by computing 
the cdf of the process X^^^(r, n) at r': 



,.)[P^^^(C,e)]^p(X,TfV,n)>0) (93) 



Ha(x),p(x),] 

where P^^^(C,e) is the average block error probability for 
the code C G LDPC[A(x), /)(x), n] under TEP decoding. To 
evaluate (93) assuming Xjf^{r^,ii) is Gaussianly distributed, 
we need its mean and variance, which can be computed 
analogously to the BP case. The mean can be computed using 
the first-order Taylor expansion around the TEP-critical point, 
i.e.: 

Sri(r,n,e) 



rf^(r^n,e)=r^^^(r^n,eTEP) 



de 



■t' 

Step 

(94) 

In this case, there exists a nonzero value for the zero-order 
Taylor expansion that indeed explains the improvement of the 
TEP decoder over the BP for finite-length codes. For the BP 
decoder, the mean process evolution in (86) is independent 
of n. Therefore, at eep and the corresponding critical point, 
(r*,n, eBp) = 0'^ and hence the expression in (88). For 



^BP 



the TEP decoder, the mean evolution depends on the code 
length n and the mean at the critical point is nonzero. We 
can understand why r|^^^(r', n, exEp) is nonzero by assuming 
that we first process all degree-one check nodes, and then 
all degree-two check nodes. After we process all the degree- 
one check nodes for e = e^p and, at r = r* and large 
n, we should get stuck at the BP critical point with high 
probability and hence rf^(r*, n, eep) ~ 0. Now, when we 
process all the degree-two check nodes, we move from r* 
to and we can generate degree-one check nodes for finite 
length codes, because Pb{^) in (58) is nonzero. This value is 
captured by a nonzero rj^^^(r', n, ctep) in (94). The evaluation 
of rf P(r' , n, ctep) for different code lengths using the TEP 
differential equations shows that it decays as 1/n, namely: 



v,TEP 



(r^n, ejEp) = Tteph 



(95) 



because the probability that two variables nodes share the 
same two check nodes decays as and we process a fraction 
of n of degree two check nodes after we have process all 
degree-one check nodes. For instance, in Fig. 9 we plot 
ri(r', n, exEp) x n (solid lines) computed from (78)-(80) for 
different code lengths, where ejEP = 0.4294. We include in 
dashed lines sample average curves obtained by simulation. 
Note that all curves converge at the critical point. The exact 
computation of 7tep is obtained by solving (78)-(80) and it is 
shown in Appendix E how it can be computed for each DD. 
For instance, the (3, 6) LDPC ensemble has a 7j£p = 0.3194 
and the irregular LDPC ensemble in (83) and (84) presents a 
7fEP = 0.2925. 
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^The BP gets stuck because it has run out of check nodes of degree one. 



Fig. 10. We include the TEP sample mean evolution (rJ^^(T) — 
7TEpn~"'^)/Ae (in solid lines) computed using a set of 500 realizations of 
X^P(r, n) for the (3, 6) ensemble with n = 2^^ and e = 0.415 (o), e = 0.42 
(□) and e = 0.425 (o). We also include the BP sample mean evolution in 
dashed lines for comparison. 
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Fig. 11. In (a), we include the BP/TEP performance for the regular (3, 6) ensemble (solid lines) along with their respective SL approximations in (89)/(96) 
(dashed lines) for n = 2^° and (+). The SL parameters are a^p = ajEP ~ 0.5603, ^J^^ni^*) ~ 0.0526 and 7tep ~ 0.3194. In (b), we reproduce the 
same figure for the irregular LDPC code defined in (83) and (84). The SL parameter is a^p = q;tep ~ 0.5791, Sj.^^^^ (r*) ^ 0.0593 and 7tep ~ 0.2925. 



By (93), we estimate the TEP performance as follows: 



/ / arf (T.n.e) 
9e 



l-Q 



\ 



Ae + 7TEpn 



T — T 

e^exEP 



V 



= Q 



Vn(eBP - e) 



"BP 



7tep 



(96) 



where P^^^(C,e) is the TEP block error rate for the code 
C G LDPC[A(x), n] and we have used the values com- 
puted in [19], [21], [50], [51] for the BP (namely aep and 
^rin ('^*)) the TEP performance. These values measure the 
mean and variance of the degree-one check nodes and mainly 
depend on the DD ensemble and the channel dispersion [49]. 
We could expect the value of t=t ^rf,ri('^0 

to slightly grow with respect to the BP estimates, because 
we are additionally processing degree-two check nodes. Em- 
pirically, we observe that ^^^^Jj^'^^ | ^^^/ is slightly larger 

=eTEP 

10, and we observe an 

=eBP 



than 



I , as shown in Fig. 

e=eBP 

insignificant reduction in the variance. 

In Fig. 11, we compare the SL solution in (96) in dashed 
lines with real performance data obtained through simulation 
in solid lines for the regular (3,6) LDPC ensemble (a) and 
for the the irregular ensembel defined in (83) and (84) (b). 
The match between dashed and solid lines is as good as 
for the BP estimates as for the TEP, showing the accuracy 



of the model for the TEP performance and the proposed 
parameter estimation. In both plots, due to the parameter 
overestimation when assuming the BP values for (5j^^^(r')/n 
and axEP, we slightly overestimate the TEP error probability. 
The overestimation is also augmented because in our estimate 
of ri(r, n, e) under TEP decoding we do not take into account 
scenarios where two variables that share a degree-two check 
node, also share more than one additional check node. For 
the code lengths considered, these scenarios can be eventually 
found and the mean can be slightly higher. In addition, there 
exists Xjf^{r^B.) trajectories that go to zero at some point 
but eventually can be positive again by removing degree-two 
check nodes. 

VI. Conclusions 

In this paper, we present the expectation propagation algo- 
rithm to address the LDPC decoding process over any DMC. 
The posterior distribution of the variables is approximated 
with a Markov-tree probability distribution, over which the 
marginal estimate can be efficiently performed. By construc- 
tion, the solution improves the BP estimate. For the binary era- 
sure channel, we show that the tree-EP algorithm reduces to a 
peeling-type algorithm, i.e. the TEP decoder, that outperforms 
the BP solution by additionally processing degree-two check 
nodes and not only degree-one check nodes. The TEP decoder 
improvement in performance is significant for practical finite- 
length LDPC codes. We prove that the complexity of this 
additional step is bounded even in the limiting case and 
therefore, the TEP complexity is of the same order than the 
BP algorithm. In the asymptotic regime, we have shows the 
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conditions to be fulfilled by an LDPC ensemble to improve 
the BP threshold. The TEP decoder differential equations 
provide the graph mean evolution under TEP decoding. Using 
Wormald's theorem, we proved that the dispersion around the 
mean evolution is bounded. This results are used to explain and 
predict the TEP decoder improvement. Along with empirical 
support, we have developed a scaling law to predict the 
performance. The analysis of the BEC case developed in detail 
in this paper for both the asymptotic and finite-length regime 
is a guideline to construct efficient implementations of the 
Tree-EP algorithm to outperform the BP solution in other 
channels like the binary symmetry channel and the binary 
additive Gaussian noise channel. 
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APPENDIX A 
TEP SOLUTION AND PROCESSING ORDER 

We prove that the TEP decoder solution is independent of 
the processing order, i.e., the order in which check nodes 
of degree-one and two are removed. Given the parity check 
matrix of a hnear block code H, we first show that any 
operation of the TEP decoder has an associated linear operator 
over H. Finally, we prove that these operations commute, 
proving the independence on the processing order. 

Consider the k x n parity check matrix of the code, H. The 
TEP algorithm is initialized by removing from the graph all 
the known variables. The removal of any of these variables 
from the graph is equivalent to applying a binary linear 
transformation over the matrix H to get a new one, H, where 
the variable Vg is completely disconnected, i.e. 



H = HA, 



(97) 



where A^ is an n-dimensional identity matrix where the s-th 
element is zero. 

To remove a check node of degree two from H, e.g., Pj 
connected to variables Vq and Vr, the resulting matrix, H, is 
obtained as follows: 



H — HBo 

-^o,r — -^o ~l~ '^o,r 



(98) 
(99) 



where o ^ r and Zo^r is an n x n zero-matrix with{Z o^r){o,r) = 
1. We have assumed that variable Vq has been removed and 
Vr inherits its connections. The symmetric transformation, 
i.e. remove Vr instead of Vq, is performed by applying the 
^r,o operator. Since the final result is one variable with all 
connections, B^^o is equivalent to Bo^-^. With no loss of 
generality we assume that the removed variable is the leftmost 



variable in Tanner graph or, equivalently, the leftmost column 
in the parity check matrix. 

When applying As after a sequence of operations (transfor- 
mations), variable s must be connected to a degree-one check 
node. If it was not in the original graph, in H, some operations 
are needed first. Similarly, for B^^^ we need variables o and 
r to share a degree-two check node. Therefore, not every 
sequence of operations is valid. 

We study the commutativity of two valid sequences / and 
//. If we prove that every pair of operations commutes, we 
may reorder the operations in the second sequence as in the 
first one. Particularly, we have to show the conditions for 
which following matrices commute: 

1) As and Ap commute for all s^p, 

2) Bo,r and ^v,z conmiute for all possible pairs {o, r} and 
{v, z} foTo^r and v ^ z, 

3) As and ^o,r conmiute for all the possible triples {s, o, r} 

for o ^ r, 

where s, o, r, v and z belong to {1, . . . The first case is 
straightforward since diagonal matrices always conmiute. 

For the other two cases, we need to prove the conditions for 
which Bo,rBv,2 = ^y z^o,r^ which can be express as follows 
using (99): 

^^o^r^v ~ A^A-y -|- A(^Za^ 2, ~I~ ^o,r-^v ~l~ ^o^r^v^Z") (100) 
'^v^z^o,r ~ -^v 

Ao + A 

v^o,r ~l~ ,z-^o ~l~ ^v,z Zo,r- (101) 

Note that, to prove both 2) and 3), we have to show that 
the matrices A^ and Zy^z commute for all the possible triples 
{o, V, z} and that Zo^r and Zy^z commute as well. Regarding 
the first case, we have: 



'^v,z-^o 



A-c^Zy 



^Onxn, Z = 

^v,z, V^O 
Onxn, V = 



(102) 
(103) 



where Onxn is the n-square zero matrix. Hence, we can 
conclude that A^ and Z^^z commute for every triplet, as long 
as o ^ z and o v.\i o = v.Z^^z and A^ do not commute, 
but neither Z Q zAq nor AqZq z are valid sequences, because 
once we have removed variable o, we cannot remove it again. 
If o = z, Zy^o and A^ do not commute, but in this case 
Z^^oAo is a valid sequence, while the reverse is not valid. But, 
if we examine the operation B^^oAo closely, it means that v 
and o share a degree-two check node and o is connected to 
degree-one check node. Hence, we can find in Sequence / the 
operations B^^^Ao and in Sequence // the operations A^A^, 
because once we have remove variable o, the check node that 
variables v and o shared is now a degree-one check node, and 
these two operations are equivalent, as it can be seen in in 
Fig. 12. 

Finally, to conclude the proof, we have to show that the 
matrices Zo,r and Zy^z commute. It is easy to check that: 



^o,z, r = v 
Onxn, Otherwise 



(104) 
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Fig. 12. The TEP decoder can process the graph by either removing the 
degree-one check node Pj and then process the degree-one check node Pi 
(once Pj has been processed Pi becomes degree one) or, indistinctly, remove 
first the degree-two check node Pi and then process the degree-one check 
node Pj. 

If the four indices are different the matrices commute. If one 
of the indices is repeated, we have three possible scenarios: 

1) V = o, the matrices commute and ^o,r^o,z and ^o,z^o,r 
are invahd sequences, because we would remove variable 
o twice in both cases. 

2) z = r, the matrices commute and both sequence are valid. 

3) V = r (or equivalently o = z), the matrices do not 
commute and one order gives a valid sequence, i.e. 
^o,r^r,z, and the other does not, i.e. ^r,z^o,r, because 
once we have removed the variable r we cannot use it 
again. But we could find in the Sequence / the matrices in 
this order ^o,r^r,z and in the Sequence // the matrices 
in this order B^^^B^^^, because after we process B^^^, z 
inherits all the check nodes of r, hence if r had a degree- 
two check node with o after we have processed ^r,z^ now 
this degree-two check node is between o and z. 

Hence, we have proven that any two matrices in the first 
sequence commute or have a valid alternative in the second. 

Appendix B 
Probability of sharing two check nodes 

In this appendix we compute the probability psit) in (58). 
Recall that it is defined as the probability for scenario 5^, in 
which two variables share a check node of degree two and, at 
least, another check node of arbitrary degree, as illustrated in 
Fig. 4(b). Let 5^"^ the particular case where both variables 
share m + 1 check nodes, one of them at least of degree two. 
If we compute the probability 

p{S^-\S2), (105) 

then the probability psit) is computed by summing over the 
parameter m: 

PB{t) = p{Si\S2) = ^piSi- \S2). (106) 

m 

In what follows, we proceed by evaluating the probability 
of scenarios and to finally conclude that 

p{Si' \S2) » p{Si' \S2) » p{S^' \S2) ■ ■ ■ (107) 

and, for large enough graphs, we have 

p{Si\S2) =p{S^'\S2) + 0{l/E% (108) 

which means that, in practice, we only have to consider the 
scenario <S^^ to study the TEP decoder for large code lengths. 



1) Probability of scenario <S^^* We first focus on the 
scenario illustrated in Fig. 4(b). Our goal is to compute 
the probability p {S2^''^\S2^, where 5^^'"^ corresponds to the 
case where Vp and Vq share just a check node of degree two 
and a check node of degree j. Then, the probability (108) is 
computed by sunmiing over the degree j: 

P {S2'\S2) = (^2^"' 1^2) . (109) 

i=i 

If dvSj) denotes the number of edges connected to Vq 
that have right degree j, we propose to obtain the probability 
p {S2^'-^\S2^ by marginalizing over the following probability 
function: 

p{S^'^\dy^,dy^,dySj) = 0,dy^{j)=^\S2)^ (110) 

for (9 G {0, . . . , dy^ - 1}, 7^ G {0, . . . dy^ - 1}. The mass 
probability function in (110) represents the probability that 
Vo and Vr share another check node of degree j, the degree 
of Vo and Vr is, respectively, dy^ and dy^, Vq has edges 
of right degree j and Vr has edges of right degree j. The 
probability density function in (110) can be rewritten applying 
the properties of conditional probability: 

p{S^'^\dy^,dy^,dySj)=0,dy^{j)=d\S2) 

= p{S^^^^\dySj) = O^dy^j) = ^,dy^,dy^,S2) 

•p{dy,U) = ^^dVrij) = '^\dy^,dy^,S2) 
•p{dy^,dyjS2). (Ill) 

The degrees of two given nodes in the graph are (asymptot- 
ically) pairwise independent [48] and, hence, for sufficiently 
large graphs we can assume 

p{dy^ , dy^ \S2) = ^ ^ + 0{E-')^ (1 12) 

e{t) e{t) 

for dy^^dy^ G {2, . . . , E"}. Each edge connected to a variable 
node has right degree j with probability rj{t)/e{t). The 
number of edges connected to either Vq or Vr with right degree 
j are asymptotically distributed according to binomial distri- 
butions B{0, dy^ - 1, Tj (t) /e{t)) and B{^, dy^-1, rj {t)/e{t)) 
respectively, where B{x,N, P) denotes a binomial distribution 
over X G N with parameters N and P [52]. Note that we 
subtract 1 to dy^ and dy^ since we know that one edge 
is connected to a check node of degree two. Therefore, the 
second term in (111) is 

p{dySj) = 0,dy^{j) = ^dy^,dy^,S2) = (113) 

Finally, we have to compute the first term in (111), i.e., the 
probability that variables Vq and Vr share a check node of 
degree j when the variables have, respectively, and edges 
with right degree j. Note first that the graph has rj{t)E edges 
with right degree j. Let us assume that the edges of one of 
the variables, Vq for example, are fixed and that the edges of 
Vr are now randomly set. 
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If the graph is large enough, with probabihty 



(114) 



each edge of Vr shares a check node of degree j with an edge 
of Vq. For large graphs, the number of checks shared between 
both variables is asymptotically described by a binomial 
distribution 5(^,6>(j - 1)/Erj{t)). The probability that they 
share al least one check node of degree j in (111) yields 

p{St \dv^ {j) = 0, dv^ ( j) = 1?, dv^ , dv^ , S2) 



0{j - 1) 



Er,(t) 

m - 1) 
Erj{t) 



1 



0{3 - 1) 
Erj{t) 

+ 0{E-'^). 



(115) 



We have already computed all the factors in the joint mass 
function in (111). In (117) (see next page), we marginalize 
over dv„ , dy^ , and "d to obtain p {82^ \S2^\ 



(<Sf'^|52) 



(/avg(t) - 1)' 



(116) 



Ee^{t) 

Finally, the probability of scenario <S^^ is computed by 
summing over the degree j as follows 

(Zavg(*) - l)'(ravg(*) - 1) 



' max 



Ee{t) 



(118) 

As expected, the probability of two variables sharing a 
check node in a random graph (aside from the check node of 
degree two that we know they are sharing) is 0{E~^e{t)~^), 
i.e., inverse to the total number of edges in the graph. This 
result is consistent with the theory of random graphs [53]. 

2) Probability of scenario 5^"" for u > 1: A similar 
analysis can be extended for the case If we define the 
subscenario 5^^'^'^, where both variables share two check 
nodes of degrees j and ^, the probability ^>(<S|f ^ |<52) is obtained 
by counting over all possible cases: 



(119) 



The probability p(52^''^'^|52) can be obtained with a similar 
procedure than the one used to compute p(5^^'^ |52). In this 
case, we marginalize over the joint mass probability function 
of the degrees of both variables (dy^^dv^) the number of 
edges in each variable with right degree j {dv^{j) ^ dy^{j)) 
and the number of edges in each variable with right degree 
£ {dy^{tj ^ dy^{tj) . Now we factorize the joint mass function 
applying the conditionality properties, as we did in (111): 

^(5^2'^'^, dy^ , dy^ , dy^ (j), dy^ (j), dy^ {£), dy^ (^) |^2) 

= p{S^' '^'^ I dy^ {£) , dy^ {£) , dy^ ( j) , dy^ ( j) , dy^ , dy^ , ^2) 
. p{dy^ {£) , dy^ {£) , dy^ ( j) , dy^ {j) = f3 \ dy^ , dy^ , ^2) 
•p{dy^,dyjS2), (120) 

where the last factor p{dy^ , dy^\S2) does not change with 
respect to (112) and the second one, similarly to (113), can be 



expressed as a product of binomial distributions. In the first 
term in (120), we have fixed the degrees of Vo and Vr and the 
number of edges of both variables with right degree j and £. 
Following the same procedure considered to derive (115), it 
can be shown that 

p{S^'''''\dv^ {i), dv^ (£), dv^ (i), dv^ (j), dv^ , dv^,S2) 



dvo{j)dvr{j){3 



Erj{t) 



l)dvA^)dvAm-l) ^j,-2 

— (X 11/ 



En{t) 



(121) 



The constant E ^ does not depend on the marginalization 
in (119). Hence, 



p(5f^|<S2)«-^ 
and, in general we have: 



p{S2'^\S2) OC 



(123) 



for m G N. Therefore p(5^'^|52) for m > 2 are negligible 
compared to p(<S^^ \S2) for sufficiently large graphs. 

Probability PB{t) is crucial to evaluate the expected graph 
evolution under TEP decoding in Section IV. In addition, for 
asymptotically large graphs, to evaluate how the graph changes 
in one iteration of the TEP decoder, it is a good approximation 
to consider that two variables that are sharing a check node of 
degree two, share at most one extra check node, as illustrated 
in Fig. 4(b). 

Appendix C 

boundedness on the graph degrees. proof of 
Lemma 2 

The proof of this lenmia is straightforward but it is conve- 
nient to review first some basic definitions. We make use of 
LDPC ensembles and degree distribution polynomials, which 
are defined in Section IV-B. 

A. Design rate, code rate and redundant codes 

Consider an LDPC ensemble defined by {\{x)^p{x)) with 
finite maximum degrees. Let Hn be the parity matrix of a code 
sample of length n chosen at random and let row(Hn) and 
col (Hn) denote respectively the number of rows and columns 
of the matrix. In the limit n ^ cx) the expected presence of 
double edges in Hqo is zero [11] and the matrix presents the 
exact DD defined by {\{x), p{x)). The following expression 
is true in this case: 

^'"^ r, (124) 



^ _ rOw(Hoo) ^ ^ _ AavglHo 



COl(Hoo) 



Oa 



= 1 - 



Oa 



"avglHoo ^avg 

where r is the design rate, Aavg|Hoo QavglHoo the 
average node and check degrees that we compute from Hoo 
and Aavg and 6avg are the same parameters computed directly 
from the pair {\{x),p{x)) using (48) and (49). Because the 
ensemble has bounded maximum degrees, the design rate is 
finite. 

Besides, the true rate rtme of the code Ho© is defined as 
rank(Hoo) 



1 



col(Hoo) 



G[0,1] 



(125) 
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E 



hv^ {t) Idvr (*) y^ fdvp-'i- 



e{t) eit) 
'dv, - 1 



E 



1? 



e{t) 



e{t) J e{t) 



e{t) 



Erj{t) 



E '-^'-^^-i^ndvMndvM)] 



dvo ^dvr 



e{t) e{t) Erj{t)- 
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and it can be easily verified that r < rtme. where in the 
limit n — )■ oo the equality only holds for a certain subset of 
ensembles [22], [54]. 

Although unusual, we can construct LDPC ensembles with 
negative design rate. For instance, the ensemble: 



X{x) 



p{x) 



(126) 



has a design rate r = — 1 but, despite this, the DD is well- 
defined and the construction of graph samples follows the 
standard procedure [11]. When we sample from such class 
of random ensembles, we obtain parity matrices that contain 
dependent rows, which are sometimes referred to as redundant 
LDPC parity matrices. By extension, ensembles such as (126) 
are called redundant ensembles [55], [56]. 

Redundant ensembles also appear when we analyze the 
residual graph after transmission over the erasure channel. Let 
(A(x), p(x)) define an LDPC ensemble of positive design rate 
and finite maximum degrees that is used for transmission over 
the erasure channel. As shown in [16], the DD that defines the 
residual ensemble after removing the set of known bits from 
the Tanner graph has the form 



\{x) = A(x), 

''^max 



(127) 
(128) 



where 



Pj 



= ^E/'.(7"i)^^(i-^)'""^- (129) 

The ensemble ^A(x),p(x)^ has, in most cases, a negative 
design rate^. An important property of this ensemble is that 

^For instance, for the (3, 6) ensemble and e = 0.42 we get 

\{x) = x^, p{x) = 0.0656 + 0.2376X + 0.3441x2 + 0.2492x^ 
+ 0.0902^"^ + 0.013x^, 
for which the design rate is —0.1448. 



its design rate is always finite as long as the original ensemble 
has finite maximum degrees. As a consequence, by (124), the 
number of variable and check nodes in the residual graph are 
of the same order. 

B. Proof of Lemma 2 

Let (A(x),p(x)) define an LDPC ensemble of positive 
design rate and finite maximum degrees. Let C be a sample 
from such ensemble that is used for transmission over a 
BEC(e). Any channel realization y G {0, ?, 1}^ gives rise to a 
residual parity matrix H sampled from (\{x)^p{x) \ in (127)- 
(129). 

As discussed above, the residual ensemble after removing 
the known bits has finite rate and, therefore, row(H) and 
col(H) are of the same order. The TEP decoder works over 
the initialized graph by removing one column and one row per 
iteration and, thus, the sequence of residual parity matrices Ht 
for t = 1, 2, . . . , satisfies the same property. 

By removing degree-one or degree-two check nodes, on the 
one hand we know that the average check degree ©avglnt ^^^^ 
not grow. On the other hand, Aavglg^ might grow as we remove 
degree-two check nodes. To show that the latter cannot grow 
unbounded, note that ©avgln^ ^avgln^ linked by E{t), 
i.e., the number of edges in the residual matrix Ht! 



A; 



col I 



e 



avglHt' 



(h, j row (h 

Equation (130) proves that, if E{t) > and col (Ut^ 



(130) 



and 

row (^t^ of the same order, then Aavgln^ and ©avgln^ have 
to keep the same order as well, which proves the lemma. 

Appendix D 

tep complexity per iteration. proof of lemma 3 

A Step of the PD algorithm, sunmiarized in the linear 
transformation in (97), has a constant complexity and the PD 
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(a) (b) 

Fig. 13. Most likely scenarios when lavg{t) — ^ oo and we are about to 
process a degree-two check node. 

overall complexity is C^(n). By crossing (97) and (99), we 
observe that the removal of a degree-two check node in a step 
of the TEP is performed by a basic PD operation followed 
by the summation of two columns of the code parity check 
matrix H. The cost of this operation is given by the number 
of ones in both columns. For LDPC codes, once we select a 
degree-two check node, the two associated variable nodes have 
in average degree /avg(^), where /avg(^) is the average edge left 
degree defined in (57). 

In Section IV-D, we show that there might exist codes for 
which, in the limit n ^ oo the average edge left degree /avg(^) 
diverges to oo at some point of the TEP decoding process. 
However, the conditions are too restrictive and we can say 
that, for most of the ensembles, /avg(^) is bounded during the 
whole decoding process. In this case, TEP decoder iteration 
just adds a constant complexity to the PD cost per iteration. 

For those ensembles for which /avg(^) may diverge at some 
point, the complexity of the TEP decoder remains linear as 
long as we only focus on degree-one nodes. If there are not 
any of them left and /avg(^) ^ oo, an eventual iteration of 
quadratic cost (with n) is required. However, we expect the 
fraction of such iterations throughout the decoding process to 
be low for several reasons. First, since the variable average 
degree Aavg is always bounded, then the fraction of variable 
nodes with infinite degree is small compared to the rest of the 
code. And second, each time we remove a degree-two check 
node and /avg(^) ^ oo, we face very high probability one of 
the following two scenarios depicted in Fig. 13. In the scenario 
(a), the probability that the two variables share an additional 
check nodes is close to one and then degree-one check nodes 
can be created. If this happens, /avg(^) is of the same order and 
it is very likely that those recently created check nodes are 
connected to variables of infinite degree. By (56), the removal 
of such nodes starts a process of massive creation of degree- 
one check nodes that brings down /avg(^)- In Fig- 13 (b), we 
have a double connection that can be removed at no cost. 

Appendix E 
Computation of the 7tep parameter 

We compute 7tep, based on the TEP solution independency 
of the processing order proven in Appendix A: 

• Start the TEP algorithm by running a BP stage. Compute 
the BP residual graph expected DD at e = eep [16]. 
Alternatively, we can obtain such DD by evaluating the 
differential equations for the TEP in (78)-(80) if we set 
Pci^) = 1 in (52) for any r until the fraction of degree- 
one check nodes in the graph cancels. 



• Using this graph as input, evaluate the graph expected 
evolution when we only remove degree-two check nodes: 
solve the system (78)-(80) by setting pc{r) = in (52) 
for all r until the graph runs out of degree-two check 
nodes. 

• 7tep = nri(r',n, ejEp), once we have run out of degree 
two check nodes. 

In Fig. 14, we represent the evolution of nr|^^(r', n, ejEp) 
after following the previous steps for the regular (3, 6) -LDPC 
code and n = 2^^ ^ = 2^^ (□) and n = 2^^ (x). 

At e(r) ^ 0.22, the BP decoder gets stuck and we begin to 
remove degree-two check nodes. In this second phase we can 
notice that nri(r',n, cjep) is independent of n. 
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Fig. 14. We plot the solution for ri (r) computed to estimate ri (r^, n, ctep) 
for a (3, 6) regular code at e = cbp = ctep- The code lengths considered are 
n = (x), n = (A) and n = 2^^ (o). The BP was first run and then 
degree-two check nodes were processed. 




